Minimum Average Distance Triangulations 



László Kozma

Universität des Saarlandes, Saarbrücken, Germany
kozma@cs.uni-saarland.de



Abstract. We study the problem of finding a triangulation T of a planar 
point set S so as to minimize the expected distance between two points
x and y chosen uniformly at random from S. By distance we mean the
length of the shortest path between x and y along edges of T, with 
edge weights given as part of the problem. In a different variant of the 
problem, the points are vertices of a simple polygon and we look for a 
triangulation of the interior of the polygon that is optimal in the same 
sense. We prove that a general formulation of the problem in which the 
weights are arbitrary positive numbers is strongly NP-complete. For the 
case when all weights are equal we give polynomial-time algorithms. In 
the end we mention several open problems. 



1 Introduction 

The problem addressed in this paper is a variant of the classical network design 
problem. In many applications, the average routing cost between pairs of nodes 
is a sensible network characteristic, one that we seek to minimize. If costs are 
additive (e.g., time delay) and symmetric, an edge-weighted, undirected graph
G is a suitable model of the connections between endpoints. The task is then to 
find a spanning subgraph T of G that minimizes the average distance. Johnson 
et al. [1] study the problem when the total edge weight of T is required to be
less than a given budget constraint. They prove this problem to be NP-complete, 
even in the special case when all weights are equal and the budget constraint 
forces the solution to be a spanning tree. 

Here we study the problem in a planar embedding: vertices of G are points 
in the plane, edges of G are straight segments between the points and weights 
are given as part of the problem. Instead of limiting the total edge weight of the 
solution, we require the edges of T to be non-intersecting. From a theoretical 
point of view this turns out to be an essential difference: the problem now has 
a geometric structure that we can make use of. As an application we could 
imagine that we wanted to connect n cities with an optimal railroad network 
using straight line connections and no intersections. We now give a more precise 
definition of the problem. 

Given a set of points S = {p_1, ..., p_n} ⊂ R^2, and weights w : S^2 → R, having
w(x, x) = 0 and w(x, y) = w(y, x), for all x, y ∈ S, we want to find a geometric,
crossing-free graph T with vertex set S and edge weights given by w, such that 
the expected distance between two points chosen uniformly at random from S
is as small as possible. By distance we mean the length of the shortest path
in T and we denote it by d_T. Since adding an edge cannot increase a distance,
it suffices to consider maximal crossing-free graphs, i.e., triangulations. We call
this the Minimum Average Distance Triangulation (madt) problem.

The previous formulation, if we omit the normalizing factor, amounts to 
finding a triangulation T that minimizes the following quantity: 

W(T) = \sum_{1 \le i < j \le n} d_T(p_i, p_j).

Similarly, we ask for the triangulation T of the interior of a polygon with 
n vertices that has the minimum value W(T). In this case the triangulation 
consists of all boundary edges and a subset of the diagonals of the polygon. 

We note that in mathematical chemistry, the quantity W(T) is a widely used 
characteristic of molecular structures, known as the Wiener index [2,3]. Efficient
computation of the Wiener index for special graphs, as well as its combinatorial
properties, have been the subject of significant research [4,5,6].

Optimal triangulations. Finding optimal triangulations with respect to various 
criteria has been intensively researched in the past decades [7,8]. One particularly
well-studied problem is minimum weight triangulation (mwt). For polygons, the
solution of mwt is found by the O(n^3) algorithm due to Gilbert [9] and Klincsek
[10], a classical example of dynamic programming. For point sets and Euclidean
weights mwt was proven to be NP-hard by Mulzer and Rote [11]. For unit
weights mwt is trivial since all triangulations have the same cost.

In contrast to both mwt and budgeted network design [1], madt is interesting
even for unit weights. In case of simple polygons, the problem is neither trivial,
nor NP-hard. The algorithm we give in §2.2 uses dynamic programming but
it is much more involved than the O(n^3) algorithm for mwt. Surprisingly, the
ideas of the mwt algorithm do not seem to directly carry over to the MADT 
problem, which, in fact, remains open for Euclidean weights, even for polygons. 
What makes our criterion of optimality somewhat atypical is that it is highly 
nonlocal. It is nontrivial to decompose the problem into smaller parts and known 
techniques do not seem to help. 

Our results. We study triangulations of point sets and of polygons. In the case 
of equal weights on all allowed edges, we assume w.l.o.g. that the weights are 
equal to one and we refer to the distance as link distance. Using link distance, 
the solution is easily obtained when one point or one vertex can be connected 
to all the others. This is shown in §2.1. For the more general case of simple
polygons (when no vertex can be connected to all other vertices), in §2.2 we
give an algorithm with a running time of O(n^11) that uses dynamic programming.
Our approach exploits the geometric structure of the problem, making a
decomposition possible in this case. 

For general point sets and arbitrary positive, symmetric weights (not necessarily
obeying the triangle inequality), in §2.3 we prove the problem to be
strongly NP-complete, ruling out the existence of an efficient exact algorithm or
of a fully polynomial time approximation scheme (FPTAS), unless P=NP. The
hardness proof is a gadget-based reduction from Planar3SAT. Again, the nonlocality
of the cost function makes the reduction somewhat difficult, requiring
a careful balancing between the magnitudes of edge weights and the size of the
construction.

We leave the problem open in the case of Euclidean weights but we present
the results of computer experiments for certain special cases in §3.

2 Results 

2.1 Link distance with one-point-visibility 

We call a point set S one-point-visible if one of the points p ∈ S can be connected
to all the points q ∈ S, where p ≠ q, using straight segments that do not contain a
point of S, except at endpoints. This condition is less restrictive than the usual 
generality (no three points collinear). Similarly, we call a polygon one-vertex- 
visible if one of the vertices can be connected to all others with diagonals or 
boundary edges of the polygon. The set of one-vertex-visible polygons includes 
all convex polygons. A fan is a triangulation in which one point or vertex (called 
the fan handle) is connected to all other points or vertices. 

Theorem 1. For a one-vertex-visible polygon every fan triangulation has the 
same average distance and this is the smallest possible. For a one-point-visible 
point set every fan triangulation has the same average distance and this is the 
smallest possible. 

Proof. The smallest possible distance of 1 is achieved for exactly 2n - 3 pairs of
vertices in polygons and 3n - h - 3 pairs of points in point sets (with h points on
the convex hull) , for every triangulation (these are the pairs that are connected 
with an edge and all triangulations of the same polygon or point set have the 
same number of edges). In a fan triangulation, all remaining pairs are at distance 
2 from each other: the path between two vertices not connected with an edge 
can go via the fan handle. □ 
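
The counting argument in the proof can be checked numerically for convex polygons. The following is a minimal sketch, not part of the original paper: the helper names `wiener_index`, `fan_edges` and `fan_cost_closed_form` are hypothetical, vertices are numbered 0..n-1, and link distances are computed by breadth-first search. The closed form follows directly from the proof: 2n - 3 pairs at distance 1 and all remaining pairs at distance 2.

```python
from collections import deque


def wiener_index(n, edges):
    """Sum of link distances over all unordered vertex pairs (BFS from each vertex)."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    total = 0
    for s in range(n):
        dist = [-1] * n
        dist[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[v] < 0:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist)
    return total // 2  # every pair was counted from both endpoints


def fan_edges(n, handle=0):
    """Edges of the fan triangulation of a convex n-gon with the given fan handle."""
    boundary = {frozenset((i, (i + 1) % n)) for i in range(n)}
    spokes = {frozenset((handle, k)) for k in range(n) if k != handle}
    return [tuple(e) for e in boundary | spokes]


def fan_cost_closed_form(n):
    """2n - 3 pairs at distance 1, all remaining pairs at distance 2."""
    pairs = n * (n - 1) // 2
    return (2 * n - 3) + 2 * (pairs - (2 * n - 3))
```

Under these assumptions the closed form simplifies to n^2 - 3n + 3, independent of which vertex is chosen as the fan handle.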

2.2 Link distance in simple polygons 

We now look at polygons that do not admit a fan triangulation. It would be 
desirable to decompose the problem and deal with the parts separately. The 
difficulty lies in the fact that when we are triangulating a smaller piece of the 
polygon, the decisions affect not just the distances within that piece but also the 
distances between external vertices. We need to do some bookkeeping of these 
global distances, but first we make some geometric observations. 

Assume that an optimum triangulation T has been found. We use a clockwise
ordering of the vertices p_1 to p_n and we denote by p_d the third vertex of the
triangle that includes p_1 p_n. Let us visit the vertices from p_1 to p_d in clockwise
order (if p_1 p_d is a boundary edge, then we have no other vertices in between).
Let p_a be the last vertex in this order such that d_T(p_a, p_1) < d_T(p_a, p_d). There
has to be such a vertex, since p_1 itself has this property and p_d does not. Let p_c
be the first vertex for which d_T(p_c, p_d) < d_T(p_c, p_1). Again, such a vertex clearly
exists (in a degenerate case we can have p_1 = p_a or p_c = p_d or both). Let p_b
denote the vertex (other than p_n) that is connected to both p_1 and p_d (unless
p_1 p_d is on the boundary).




Fig. 1: (a) Special vertices of the polygon, (b) Splitting up the polygon. 

On the other side of the triangle p_1 p_d p_n we similarly visit the vertices from p_d
to p_n and assign the label p_e to the last vertex such that d_T(p_e, p_d) < d_T(p_e, p_n)
and p_g to the first vertex such that d_T(p_g, p_n) < d_T(p_g, p_d) and we let p_f be
the vertex connected to both p_d and p_n (Fig. 1(a)). Now we can observe some
properties of these vertices.

Lemma 1. Let 1 ≤ k ≤ d. Then the following hold (analogous statements hold
for d ≤ k ≤ n):

(a) d_T(p_k, p_1) < d_T(p_k, p_d) iff 1 ≤ k ≤ a.

(b) d_T(p_k, p_1) > d_T(p_k, p_d) iff c ≤ k ≤ d.

(c) d_T(p_k, p_1) = d_T(p_k, p_d) iff a < k < c. In particular, if p_b exists, then
a < b < c. Otherwise a = 1, c = d = 2, and p_1 p_2 is on the boundary.

Proof. (a) The largest index k for which d_T(p_k, p_1) < d_T(p_k, p_d) is k = a by the
definition of p_a. For the converse, observe that for all intermediary vertices p_k on
a shortest path between p_1 and p_a we have d_T(p_k, p_1) < d_T(p_k, p_d). Now suppose
there is a vertex p_l, with 1 < l < a, such that d_T(p_l, p_d) ≤ d_T(p_l, p_1). Such an
inequality also holds for all intermediary vertices on the shortest path between
p_l and p_d. Since the shortest path between p_l and p_d intersects the shortest path
between p_a and p_1, the common vertex has to be at the same time strictly closer
to p_1 and closer or equal to p_d, a contradiction.

(b) Similar argument as for (a). 

(c) First, observe that a < c, otherwise some vertex would have to be strictly
closer to both p_1 and p_d, a contradiction. Then, since for 1 ≤ k < c we have
d_T(p_k, p_1) ≤ d_T(p_k, p_d) and for a < k ≤ d we have d_T(p_k, p_d) ≤ d_T(p_k, p_1), it
follows that for indices in the intersection of the two intervals (a < k < c) we
have d_T(p_k, p_d) = d_T(p_k, p_1). The converse follows from (a) and (b). Also, we
have d_T(p_b, p_1) = d_T(p_b, p_d) = 1. □

Equipped with these facts, we can split the distance between two vertices on
different sides of the p_1 p_d p_n triangle into locally computable components.

Let 1 ≤ x ≤ d. Consider the shortest path between p_x and p_d. Clearly, for
all vertices p_k on this path 1 ≤ k ≤ d holds, otherwise the path would go via
the edge p_1 p_n and it could be shortened via p_1 p_d. Similarly, given d ≤ y ≤ n,
for all vertices p_k on the shortest path between p_y and p_n, we have d ≤ k ≤ n.
We conclude that d_T(p_x, p_d) and d_T(p_y, p_n) only depend on the triangulations of
(p_1, ..., p_d) and (p_d, ..., p_n) respectively. We now express the global distance
d_T(p_x, p_y) in terms of these two local distances.

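Lemma 2. Let p_1, ..., p_n be defined as before, 1 ≤ x ≤ d, and d ≤ y ≤ n, and
let φ = d_T(p_x, p_d) + d_T(p_y, p_n). Then the following holds, covering all possible
values of x and y:

d_T(p_x, p_y) = \begin{cases}
  \varphi - 1 & \text{if } d \le y \le e; \\
  \varphi + 1 & \text{if } g \le y \le n \text{ and } a < x \le d; \\
  \varphi     & \text{otherwise.}
\end{cases}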

Proof. In each of the cases we use Lemma 1 to argue about the possible ways in
which the shortest path can cross the triangle p_1 p_d p_n. For example, if d ≤ y ≤ e,
the shortest path goes through p_d, therefore we have d_T(p_x, p_y) = d_T(p_x, p_d) +
d_T(p_y, p_d). Since d_T(p_y, p_d) = d_T(p_y, p_n) - 1, we obtain d_T(p_x, p_y) = φ - 1. The
other cases use similar reasoning and we omit them for brevity. □
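
The case distinction of Lemma 2 can be sanity-checked by brute force on small convex polygons. The sketch below is an illustration only and not the paper's method: it enumerates all triangulations of a convex n-gon (a special case of simple polygons), computes link distances by BFS, determines d, a, e, g from their definitions, and compares d_T(p_x, p_y) with the value predicted by the lemma. The function names are hypothetical and vertices are numbered 1..n as in the text.

```python
from collections import deque


def triangulations(i, j):
    """Yield the edge sets of all triangulations of the convex polygon p_i..p_j."""
    if j - i == 1:
        yield {(i, j)}          # a single boundary edge
        return
    for k in range(i + 1, j):   # p_k is the apex of the triangle on edge (i, j)
        for left in triangulations(i, k):
            for right in triangulations(k, j):
                yield left | right | {(i, j)}


def link_distances(n, edges):
    """All-pairs link distances by BFS; vertices are numbered 1..n."""
    adj = {v: [] for v in range(1, n + 1)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    dist = {}
    for s in adj:
        seen = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        dist[s] = seen
    return dist


def lemma2_holds(n, edges):
    """Check the three cases of Lemma 2 for every pair (x, y) with x <= d <= y."""
    D = link_distances(n, edges)
    # p_d is the third vertex of the triangle containing the edge p_1 p_n
    d = next(k for k in range(2, n) if (1, k) in edges and (k, n) in edges)
    a = max(k for k in range(1, d + 1) if D[k][1] < D[k][d])
    e = max(k for k in range(d, n + 1) if D[k][d] < D[k][n])
    g = min(k for k in range(d, n + 1) if D[k][n] < D[k][d])
    for x in range(1, d + 1):
        for y in range(d, n + 1):
            phi = D[x][d] + D[y][n]
            if y <= e:
                expected = phi - 1
            elif y >= g and x > a:
                expected = phi + 1
            else:
                expected = phi
            if D[x][y] != expected:
                return False
    return True


def check_all(max_n=8):
    """True if Lemma 2 holds on every triangulation of every convex n-gon, n <= max_n."""
    return all(lemma2_holds(n, T)
               for n in range(3, max_n + 1)
               for T in triangulations(1, n))
```

Such a check covers only convex polygons and small n, so it supports the statement without replacing the proof.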

Lemma 2 allows us to decompose the problem into parts that can be solved
separately. We proceed as follows: we guess a triangle p_1 p_d p_n that is part of the
optimal triangulation and we use it to split the polygon in two. We also guess
the special vertices p_a, p_c, p_e, p_g. We recursively find the optimal triangulation of
the two smaller polygons with vertices (p_1, ..., p_d) and (p_d, ..., p_n). Besides the
distances within the subpolygons we also need to consider the distances between
the two parts. Using Lemma 2 we can decompose these distances into a distance
to p_d in the left part, a distance to p_n in the right part and a constant term.

We now formulate an extended cost function W_EXT that has a second term
for accumulating the distances to endpoints that result from splitting up global
distances. The coefficient α ∈ N will be uniquely determined by the sizes of the
polygons, which in turn are determined by the choice of the index d. We express
this new cost function for a general subpolygon (p_i, ..., p_j):

W_EXT(T, \alpha) = \sum_{i \le x < y \le j} d_T(p_x, p_y) + \alpha \sum_{i \le x \le j} d_T(p_x, p_j).



Observe that minimizing W_EXT(T, 0) over the whole polygon (p_1, ..., p_n)
solves the initial problem. Using Lemma 2
we can split the sums and the distances until we can express W_EXT recursively
in terms of smaller polygons and the indices of special vertices a, c, e, g. Note
that p_a, p_c, p_e, p_g play the same role as in the earlier discussion, but now the
endpoints are p_i and p_j instead of p_1 and p_n.

W_EXT(T, \alpha)
  = \sum_{i \le x < y \le d} d_T(p_x, p_y) + \sum_{d \le x < y \le j} d_T(p_x, p_y)
    + \sum_{i \le x \le d,\ d \le y \le j} d_T(p_x, p_y)
    - \sum_{i \le x \le j} d_T(p_x, p_d)
    + \alpha \sum_{i \le x \le d} d_T(p_x, p_j) + \alpha \sum_{d \le x \le j} d_T(p_x, p_j)
    - \alpha \cdot d_T(p_d, p_j)

  = W_EXT(T, \alpha + j - d)_i^d + W_EXT(T, \alpha + d - i)_d^j
    + (\alpha + j - g + 1)(d - a - 1) + (e - d + 1)(i - d).

How can we make sure that the constraints imposed by the choice of the
special vertices p_a, p_c, p_e, p_g are respected by the recursive subcalls? If the left
side of the triangle is on the boundary (d = i + 1), it follows that a = i and
c = d and it is trivially true that d_T(p_a, p_i) < d_T(p_a, p_d). Similarly, if d = j - 1,
it follows that e = d and g = j, therefore d_T(p_e, p_d) < d_T(p_e, p_j). The general
case remains, when one of the sides of the triangle is not on the boundary.
The following lemma establishes a necessary and sufficient condition for the
constraints to hold. We write it only for the side p_i p_d and special indices a and
c; a symmetric argument works for the side p_d p_j and special indices e and g.

Lemma 3. Let p_i, ..., p_j be a triangulated polygon. Assume that the triangulation
contains the triangles p_i p_d p_j and p_i p_b p_d. Then the following hold:

(a) We have a as the largest index (i ≤ a < d) for which d_T(p_a, p_i) < d_T(p_a, p_d)
iff a + 1 is the smallest index (i < a + 1 ≤ b) such that d_T(p_{a+1}, p_b) <
d_T(p_{a+1}, p_i).

(b) We have c as the smallest index (i < c ≤ d) for which d_T(p_c, p_d) < d_T(p_c, p_i)
iff c - 1 is the largest index (b ≤ c - 1 < d) such that d_T(p_{c-1}, p_b) <
d_T(p_{c-1}, p_d).

Proof. (a) If a is the largest index such that d_T(p_a, p_i) < d_T(p_a, p_d) then a + 1
is the smallest index such that d_T(p_{a+1}, p_d) ≤ d_T(p_{a+1}, p_i). Since a + 1 ≤ b,
the shortest path between p_{a+1} and p_d contains p_b, therefore d_T(p_{a+1}, p_b) <
d_T(p_{a+1}, p_i). To see that a + 1 is the smallest index with this property, we need
to prove that d_T(p_k, p_b) ≥ d_T(p_k, p_i) for all i ≤ k ≤ a. This inequality follows
from d_T(p_k, p_d) > d_T(p_k, p_i) and d_T(p_k, p_d) = d_T(p_k, p_b) + 1.

For the converse, assume a + 1 to be the smallest index such that d_T(p_{a+1}, p_b) <
d_T(p_{a+1}, p_i). Then d_T(p_a, p_b) ≥ d_T(p_a, p_i). Since d_T(p_a, p_d) = d_T(p_a, p_b) + 1, it
follows that d_T(p_a, p_i) < d_T(p_a, p_d). To see that a is the largest index with this
property, we need d_T(p_k, p_i) ≥ d_T(p_k, p_d) for all a < k ≤ b (for k > b the
inequality clearly holds). Again, this follows from d_T(p_k, p_i) > d_T(p_k, p_b) and
d_T(p_k, p_d) = d_T(p_k, p_b) + 1.

(b) Similarly. □



Procedure EXT (see Fig. 2) returns the cost W_EXT of a triangulation that
minimizes this cost. We can modify our procedure without changing the asymptotic
running time, so as to return the actual triangulation achieving minimum
cost. The results are then merged to form the full solution.

Lemma 3 tells us that in all recursive calls two of the four special vertices are
fixed and we only need to guess the remaining two. We label these new vertices
as p_a', p_g' and p_a'', p_g''. They play the same role in their respective small polygons
as p_a and p_g in the large polygon (see Fig. 1(b) for illustration). The notation
p ↔ q indicates the condition that vertices p and q see each other within the
polygon.



procedure EXT((p_i, ..., p_j), p_a, p_c, p_e, p_g, α):

  if (a = i) and (c = e = i + 1) and (g = j = i + 2):
    return 3 + 2α;   /* the polygon has only three vertices */
  else:
    return min over d, a', g', a'', g'' with p_i ↔ p_d ↔ p_j of
      { EXT((p_i, ..., p_d), p_a', p_{a+1}, p_{c-1}, p_g', α + j - d)
      + EXT((p_d, ..., p_j), p_a'', p_{e+1}, p_{g-1}, p_g'', α + d - i)
      + (α + j - g + 1)(d - a - 1) + (e - d + 1)(i - d) };



Fig. 2: Procedure for finding the triangulation that minimizes W_EXT(T, α).
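
The base case of the procedure can be worked out directly from the definition of the extended cost. As a short illustration (a worked instance, assuming the definition of W_EXT as the sum of pairwise distances plus α times the sum of distances to p_j), take the triangle (p_i, p_{i+1}, p_{i+2}) with j = i + 2, in which every pair of distinct vertices is at link distance 1:

```latex
\begin{aligned}
W_{EXT}(T,\alpha)
  &= \bigl[d_T(p_i,p_{i+1}) + d_T(p_i,p_{i+2}) + d_T(p_{i+1},p_{i+2})\bigr]
   + \alpha\bigl[d_T(p_i,p_j) + d_T(p_{i+1},p_j) + d_T(p_j,p_j)\bigr] \\
  &= (1 + 1 + 1) + \alpha\,(1 + 1 + 0) \\
  &= 3 + 2\alpha .
\end{aligned}
```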



The process terminates, since every subcall is on polygons of smaller sizes and
we exit the recursion on triangles. Clearly every triangulation can be found by
the algorithm and the correctness of the decomposition is assured by Lemma 2.
The fact that indices a, c, e, g indeed fulfill their necessary role (and thus the
expressions in the cost-decomposition are correct) is guaranteed by Lemma 3.

For the cases when there is no suitable vertex that can be connected to the 
required endpoints and whose index fulfills the required inequalities, we adopt 
the convention that the minimum of the empty set is +∞, thus abandoning
those branches in the search tree. 

Theorem 2. The running time of EXT on a polygon with n vertices is O(n^11).

Proof. Observe that all polygons in the function calls have contiguous indices,
therefore we can encode them with two integers between 1 and n. Furthermore,
if the initial call has α = 0, then it can be shown that on each recursive call
the parameter α becomes "n minus the number of vertices in the polygon". For
this reason it is superfluous to pass α as an argument. There are four remaining
parameters which can all take n different values. We can build up a table containing
the return values of all O(n^6) possible function calls, starting from the
smallest and ending with the full polygon. In computing one entry we take the
minimum of O(n^5) values, giving a total running time of O(n^11). □



2.3 Arbitrary positive weights 



We now prove that the decision version of madt for point sets is NP-complete
if the edge weights w are arbitrary positive numbers, such that w(p_i, p_j) = 0
iff i = j and w(p_i, p_j) = w(p_j, p_i) for all i, j. Such a weight function is called a
semimetric. We leave open the status of the problem for metric weights (that 
obey the triangle inequality). We defer most of the details of the proofs in this 
section to the Appendix. Where possible, we give a short, intuitive explanation. 

madt (decision version): For given W* ∈ R, is there a triangulation T of a
given point set S and weights w, such that W(T) < W* ? 

The problem is clearly in NP, since for a given triangulation T, we can use 
an all-pairs shortest path algorithm to compute W(T) and compare it with W* 
in polynomial time. 
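
The membership argument can be sketched as a small certificate checker. This is an illustrative sketch under stated assumptions, not the paper's code: the names `total_distance` and `certifies` are hypothetical, vertices are numbered 0..n-1, the candidate triangulation is given as a list of weighted edges `(u, v, w)`, and the check that the edges actually form a crossing-free triangulation is omitted. The comparison is written as non-strict here, which is an assumption about the intended inequality.

```python
from itertools import product


def total_distance(n, weighted_edges):
    """W(T): sum of shortest-path lengths over all unordered pairs (Floyd-Warshall)."""
    INF = float("inf")
    dist = [[0 if u == v else INF for v in range(n)] for u in range(n)]
    for u, v, w in weighted_edges:
        dist[u][v] = dist[v][u] = min(dist[u][v], w)
    # product(...) varies k slowest, as Floyd-Warshall requires
    for k, u, v in product(range(n), repeat=3):
        if dist[u][k] + dist[k][v] < dist[u][v]:
            dist[u][v] = dist[u][k] + dist[k][v]
    return sum(dist[u][v] for u in range(n) for v in range(u + 1, n))


def certifies(n, weighted_edges, threshold):
    """Accept the certificate T if its total distance does not exceed the threshold W*."""
    return total_distance(n, weighted_edges) <= threshold
```

For example, a quadrilateral with unit boundary edges and a diagonal of weight 5 has total distance 8: the heavy diagonal is bypassed by a two-edge boundary path, which is possible because semimetric weights need not obey the triangle inequality.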

We prove NP-hardness using a reduction from Planar3SAT [12]. In 3SAT,
given a 3-CNF formula, we ask whether there exists an assignment of truth values
to the variables such that each clause has at least one true literal. Planar3SAT
restricts the question to planar formulae: those that can be represented as a
planar graph in which vertices represent both variables and clauses of the formula
and there is an edge between clause C and variable x iff C contains either x or
¬x.

Knuth and Raghunathan [13] observed that Planar3SAT remains
NP-complete if it is restricted to formulae embedded in the following fashion: 
variables are arranged on a horizontal line with three-legged clauses on the two 
sides of the line. Clauses and their three legs are properly nested, i.e., none of 
the legs cross each other. We can clearly have an actual embedding in which all 
three legs of a clause are straight lines and the "middle" leg is perpendicular to 
the line of the variables (Appendix: Fig. 7). For simplicity, let us call such an
embedding of a formula a planar circuit. 

We put two extra conditions on the admissible planar circuits such that 
Planar3SAT remains NP-complete when restricted to planar circuits obeying 
these conditions: (R1) no variable appears more than once in the same clause,
and (R2) every variable appears in at least two clauses. 

Lemma 4. Given a planar circuit φ_1, we can transform it into a planar circuit
φ_2 that obeys R1 and R2, such that φ_2 has a satisfying assignment iff φ_1 does.

Proof. We examine every possible way in which R1 or R2 can be violated and
give transformations that remove the violations while preserving the planar embedding,
as well as the satisfiability of the circuit. □

The gadgets used in the reduction are shown in Fig. 3. They consist of points
in the plane and the weights of the potential edges between them. Weights
can take one of three values: the value 1, a small value ε and a large value
a (higher than any distance in the triangulation). We call edges with weight a 
irrelevant and we do not show them in the figures. Including irrelevant edges 




Fig. 3: (a) Wire gadget. (b) Simplified variable gadget. (c) Clause gadget. 




Fig. 4: (a) Bridge between variable and clause. (b) Pure triangulations of a variable. 



in a triangulation never decreases the cost W, therefore we can safely ignore
them. The values ε and a depend on the problem size (number of clauses and
variables).

The basic building block is the wire, shown in Fig. 3(a). The thick solid
edges are part of all triangulations. From each pair of intersecting dotted and
dashed edges exactly one is included in any triangulation. The weights of all
non-irrelevant edges in a wire are 1. The wire-piece can be bent and stretched
freely as long as we do not introduce new crossings or remove a crossing between
two edges (irrelevant edges do not matter).

The variable is a wire bent into a loop, with the corresponding ends glued
together. We illustrate this in Fig. 3(b) using a wire-piece with 16 vertices. The
construction works with any 4k vertices for k > 3 and in fact we will use many
more than 16 vertices in each variable.

The clause gadget and the weights of its edges are shown in Fig. 3(c).

A bridge is a pair of edges that links a variable to a clause. A clause gadget 
has three fixed "places" where bridges are connected to it. We use parallel or 
crossing bridges, as seen in Fig. 4(a). Given a Planar3SAT instance (a planar
circuit), we transform it into an instance of madt as follows: we replace the
vertices of the planar circuit by variable or clause gadgets and we replace the
edge between clause C and variable x by a parallel bridge if C contains x and
by a crossing bridge if C contains ¬x.

Lemma 5. Using the gadgets and transformations described above we can represent
any planar circuit as a madt instance. □

Since the gadgets allow a large amount of flexibility, the proof is straightfor- 
ward. Now we can formulate our main theorem: 

Theorem 3. We can transform any planar circuit φ into a madt instance consisting
of a point set S in the plane, a semimetric weight function w : S^2 → R
and a threshold W*, such that S admits a triangulation T with W(T) < W* iff
φ has a satisfying assignment. All computations can be done in polynomial time
and the construction is of polynomial size, as are all parameters.

Corollary 1. madt with semimetric weights is strongly NP-complete.

The proof of Theorem 3 relies on a sequence of lemmas. The high level idea
is the following: we call a triangulation of the construction pure, if every variable
gadget, together with its associated bridges contains either only dashed edges
or only dotted edges (besides the thick solid edges), see Fig. 4(b). First we
show that we only need to consider pure triangulations and thus we can use
the pure states of the gadgets to encode an assignment of truth values, with
the convention that dotted means true and dashed means false. Then, we prove that
satisfying assignments lead to triangulations with the smallest cost. Finally, we 
bound the difference in cost between different satisfying assignments and we 
show how to generate a baseline triangulation, with cost not far from the cost 
of a satisfying assignment (if one exists). The cost of the baseline triangulation 
can be computed in polynomial time. 

We denote the number of variables in our planar circuit by n_v and the number
of clauses by n_c. Due to condition R2, we have n_v ≤ 1.5 n_c. We denote the number
of vertices in a variable gadget between two bridges (not including the bridge 
endpoints) by N (the same value for all variables). In Fig. 4(a), for instance,
we have N = 14. The proof requires a careful balancing of the parameter N
describing the size of the construction and the weight ε.

Lemma 6. If N > 5 · 10^5 n_c^3, for any impure triangulation T_impure of the construction
we can find a pure triangulation T_pure such that W(T_pure) < W(T_impure).

Proof. The main idea is that if a variable is impure then the loop of the gadget is 
necessarily broken in some place and this leads to a penalty in cost that cannot 
be offset by any other change in the triangulation. □ 

Lemma 7. Let Tsat be a triangulation corresponding to a satisfying assign- 
ment of a planar circuit (assuming such an assignment exists) and let T non sAT 
be the triangulation with smallest cost W(T non sAT) among all triangulations cor- 
responding to nonsatisfying assignments. Then, for N > 5 • 10 5 rt c 3 and e < jkz, 

we have W(T nonSAT ) ~ W{T SAT ) >€;■ □ 



Lemma 8. If T_SAT1 and T_SAT2 are two triangulations corresponding to different
satisfying assignments, then, given previous bounds on N and ε, we have that
|W(T_SAT1) - W(T_SAT2)| < 150 n_c^2 N. □



Lemma 9. For any T base i ine and any T S at, 
\W(T baseUne ) - W(T SAT )\ < W0n c 2 N. 



given previous bounds, we have 

a 



We can now generate the threshold as W* = W(T_baseline) + 300 n_c^2 N + 1. In
accordance with our previous constraints we set N = 10 6 n c 3 , e — w il n & , and 
this ensures that W* falls in the gap between satisfying and nonsatisfying trian- 
gulations. We note that all constructions and computations can be performed in 
polynomial time and all parameters have polynomial magnitude. This concludes 
the proof of NP-completeness. □ 



3 Open questions 

The main unanswered question is the status of the problem for metric, in particular
Euclidean weights. For polygons we have not succeeded in establishing
results similar to Lemma 1 that would enable a dynamic programming approach.
For point sets we suspect that the problem remains NP-hard, but we have not
found a reduction.

The Euclidean-distance problem remains open even for the special case of 
regular polygons with n vertices (in this case the boundary of the polygon already 
gives a π/2-approximation). Computer simulation shows that the exact solution is
the fan for n up to 18, except for the cases n = 7 and n = 9, where the solutions
are shown in Fig. 5(a). We conjecture these to be the only counterexamples. As
another special case, Fig. 5(b) shows the solutions obtained for a 2-by-n grid,
for n up to 16.
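
An exhaustive search of this kind is easy to reproduce for small regular polygons. The following is a sketch under stated assumptions, since the setup of the original experiments is not described here: all triangulations of the regular n-gon inscribed in the unit circle are enumerated, edge weights are Euclidean lengths, shortest paths are computed with Floyd-Warshall, and the best cost is compared with the cost of a fan. The function names are hypothetical and vertices are numbered 0..n-1.

```python
import math


def triangulations(i, j):
    """Yield the edge sets of all triangulations of the convex polygon p_i..p_j."""
    if j - i == 1:
        yield {(i, j)}
        return
    for k in range(i + 1, j):
        for left in triangulations(i, k):
            for right in triangulations(k, j):
                yield left | right | {(i, j)}


def euclidean_cost(n, edges):
    """Total shortest-path distance on a regular n-gon with Euclidean edge weights."""
    pts = [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
           for k in range(n)]
    INF = float("inf")
    dist = [[0.0 if u == v else INF for v in range(n)] for u in range(n)]
    for u, v in edges:
        dist[u][v] = dist[v][u] = math.dist(pts[u], pts[v])
    for k in range(n):
        for u in range(n):
            for v in range(n):
                if dist[u][k] + dist[k][v] < dist[u][v]:
                    dist[u][v] = dist[u][k] + dist[k][v]
    return sum(dist[u][v] for u in range(n) for v in range(u + 1, n))


def best_and_fan(n):
    """Return (minimum cost over all triangulations, cost of the fan with handle 0)."""
    best = min(euclidean_cost(n, T) for T in triangulations(0, n - 1))
    fan = {(0, k) for k in range(1, n)} | {(k, k + 1) for k in range(1, n - 1)}
    return best, euclidean_cost(n, fan)
```

Comparing the two returned values for a given n indicates whether the fan is optimal for that n. The number of triangulations grows as the Catalan numbers, so this approach is only practical for small polygons.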







Fig. 5: (a) Solution for regular polygons. (b) Solution for 2-by-n grids. 



The problem remains open with unit weights in the case of point sets not admitting
a fan, even in special cases such as a 3-by-n grid. For the NP-hard variant
of the problem the question remains whether a polynomial-time approximation
scheme (PTAS) exists (an FPTAS is impossible, unless P=NP).

Variants of the problem that we have not studied include placing various 
constraints on the allowed geometric graphs besides non-crossing, such as a total 
budget on the sum of edge weights or bounded vertex-degree, as well as the case 
when Steiner points are allowed. One can also study the problem of maximizing 
the average distance, instead of minimizing it. 

5 Appendix

Illustration of Theorem 1. 




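Details of the splitting up of the cost-function using Lemma 2.

W_EXT(T, \alpha)
  = \sum_{i \le x < y \le j} d_T(p_x, p_y) + \alpha \sum_{i \le x \le j} d_T(p_x, p_j)

  = \sum_{i \le x < y \le d} d_T(p_x, p_y) + \sum_{d \le x < y \le j} d_T(p_x, p_y)
    - \sum_{d \le x \le j} d_T(p_x, p_d) + \sum_{i \le x \le d,\ d \le y \le j} d_T(p_x, p_y)
    - \sum_{i \le x \le d} d_T(p_x, p_d)
    + \alpha \sum_{i \le x \le d} d_T(p_x, p_j) + \alpha \sum_{d \le x \le j} d_T(p_x, p_j)
    - \alpha \cdot d_T(p_d, p_j)

  = W_EXT(T, 0)_i^d + W_EXT(T, 0)_d^j
    - \Bigl( \sum_{d \le x \le j} d_T(p_x, p_j) - (e - d + 1) + (j - g + 1) \Bigr)
    + \sum_{i \le x \le d,\ d \le y \le j} \bigl( d_T(p_x, p_d) + d_T(p_y, p_j) \bigr)
    - (d - i + 1)(e - d + 1) + (j - g + 1)(d - a)
    - \sum_{i \le x \le d} d_T(p_x, p_d) + \alpha \sum_{i \le x \le d} d_T(p_x, p_d) + \alpha (d - a)
    + \alpha \sum_{d \le x \le j} d_T(p_x, p_j) - \alpha

  = W_EXT(T, \alpha + j - d)_i^d + W_EXT(T, \alpha + d - i)_d^j
    + (\alpha + j - g + 1)(d - a - 1) + (e - d + 1)(i - d).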



Proof. (Lemma 4)

First we give the transformations that establish conditions R1 and R2 while
preserving the satisfiability of the formula. We use the following notation: x and y
are variables of the original formula, a, b, c and d are variables introduced during
the transformation. A clause (x ∨ y ∨ z) is denoted simply as (xyz) and ∅ means
no clause. The values True and False are denoted by T and F, respectively.




Fig. 7: Planar circuit of the formula (b V V -id) A (a V -.& V d) A (->c V d V e). 



The following replacements cover all possible scenarios in which a variable is
repeated:

(xxx)     →  (xab) ∧ (x¬ab) ∧ (x¬bc) ∧ (x¬b¬c)                    (x = T)
(xx¬x)    →  ∅                                                    (x arbitrary)
(x¬x¬x)   →  ∅                                                    (x arbitrary)
(¬x¬x¬x)  →  (¬xab) ∧ (¬x¬ab) ∧ (¬x¬bc) ∧ (¬x¬b¬c)                (x = F)
(xxy)     →  (xy¬a) ∧ (abc) ∧ (a¬bc) ∧ (a¬cd) ∧ (a¬c¬d)           ((x = T) or (y = T))
(¬x¬xy)   →  (¬xy¬a) ∧ (abc) ∧ (a¬bc) ∧ (a¬cd) ∧ (a¬c¬d)          ((x = F) or (y = T))
(xx¬y)    →  (x¬y¬a) ∧ (abc) ∧ (a¬bc) ∧ (a¬cd) ∧ (a¬c¬d)          ((x = T) or (y = F))
(¬x¬x¬y)  →  (¬x¬y¬a) ∧ (abc) ∧ (a¬bc) ∧ (a¬cd) ∧ (a¬c¬d)         ((x = F) or (y = F))
(x¬xy)    →  ∅                                                    (x, y arbitrary)
(x¬x¬y)   →  ∅                                                    (x, y arbitrary)
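
That the replacement clauses encode the stated constraints can be confirmed by enumerating truth assignments. The sketch below is an illustration, not part of the original proof: the helper names and the textual clause encoding (a leading "-" marks a negated literal) are assumptions made for this example, and only the replacements for (xxx) and (xxy) are checked.

```python
from itertools import product


def satisfiable(clauses, fixed):
    """Brute force: does some assignment extending `fixed` satisfy all clauses?

    A clause is a list of (variable, polarity) pairs; polarity False means negated.
    """
    free = sorted({v for c in clauses for v, _ in c} - set(fixed))
    for values in product([False, True], repeat=len(free)):
        assignment = dict(fixed, **dict(zip(free, values)))
        if all(any(assignment[v] == pol for v, pol in c) for c in clauses):
            return True
    return False


def lit(s):
    """Parse 'x' or '-x' into a (variable, polarity) pair."""
    return (s[1:], False) if s.startswith("-") else (s, True)


def parse(text):
    """Parse comma-separated clauses of space-separated literals."""
    return [[lit(s) for s in clause.split()] for clause in text.split(",")]


# replacement for (xxx): satisfiable exactly when x = T
FORCE_X = parse("x a b, x -a b, x -b c, x -b -c")
# replacement for (xxy): satisfiable exactly when x = T or y = T
X_OR_Y = parse("x y -a, a b c, a -b c, a -c d, a -c -d")
```

The same helper can be applied to the negated variants by flipping the polarity of x or y in the clause strings.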



If x appears in a single clause, we add two new clauses:

∅  →  (xab) ∧ (xab)                                               (x arbitrary)



Now we show that the transformations maintain the nesting of the clauses, 
i.e., they do not introduce crossings. Removing a clause clearly does not affect 
the embedding. The remaining transformations are of the following type: 

(i) for given x, add (xab) ∧ (x¬ab) ∧ (x¬bc) ∧ (x¬b¬c)

(ii) for given x and y appearing in the same clause, add (xy¬a) ∧ (abc) ∧ (a¬bc) ∧
(a¬cd) ∧ (a¬c¬d)

(iii) for given x, add (xab) ∧ (xab).



Figure 8 shows how we can make these changes while maintaining proper
nesting. In (i) vertices a, b and c are placed near x, so that x can still be linked
to any other clause. In (ii) vertex a is placed near x, such that the clause (xy¬a)
can take the place of the previous clause in which x and y appeared. We place
b, c and d near a as we did in (i). In (iii) we place a and b near x. In this way, x
can still be connected to any clause. □




Proof. (Lemma [5]) 

Both the variable and clause gadgets can be rotated and made arbitrarily 
small, and the bridges connecting them can be arbitrarily narrow. It 
remains to be shown that edges can emanate from a vertex at any required angle. 
In a planar circuit, we can move the clauses arbitrarily close to the line on which 
the variables lie, thereby making the three angles between the bridges of a clause 
$90^\circ - \varepsilon_1$, $90^\circ - \varepsilon_2$ and $180^\circ + \varepsilon_3$, with $\varepsilon_1, \varepsilon_2, \varepsilon_3$ positive and 
arbitrarily close to zero, such that $\varepsilon_1 + \varepsilon_2 - \varepsilon_3 = 0$. 

We then show by construction that a clause gadget can be stretched so as 
to have two angles of 85° and one angle of 190° (Fig. [9]). Since the gadget having 
these angles can be stretched continuously at any bridge to the symmetric gadget 
with all angles equal to 120°, a clause with angles $90^\circ - \varepsilon_1$, $90^\circ - \varepsilon_2$, $180^\circ + \varepsilon_3$ 
can clearly be represented. 

For variable gadgets we can prove something stronger: we can represent any 
angle in the interval (0°, 360°) between neighboring bridges. To see this, consider 
a wire-piece between two neighboring bridges that does not bend at all, in which 
case the bridges are parallel. To produce a full circle, 16 vertices are sufficient, 
as seen in Fig. 3(b). Therefore, if we enforce that the number of vertices between 
two bridges is greater than 16, we can represent any angle. □ 




Fig. 9: A stretched clause. 



Irrelevant edges. Let us count the number of vertices in the resulting MADT 
instance. Clause gadgets have 12 vertices each, as seen in Fig. 3(c), therefore 
the total number of vertices due to clauses is $12 n_c$. The number of vertices in 
variable gadgets is $3 n_c (N + 2)$ (for all $3 n_c$ bridges we have 2 base vertices and 
$N$ vertices separating it from the next bridge in clockwise order). The number 
of all vertices in the construction is thus $18 n_c + 3 n_c N$, which is strictly smaller 
than $4 n_c N$, assuming that $N > 18$. 

Since the whole construction is connected and non-irrelevant edge weights 
are at most 1, the longest possible distance is smaller than $4 n_c N$. Thus, we can 
set the weights of irrelevant edges to $a = (4 n_c N)^3 > W(T)$. In this way the 
irrelevant edges never contribute to the cost, therefore we can simply ignore them. 
Note that the weights $a$ violate the triangle inequality. 
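The counting argument above can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and W(T) is taken to be the sum of distances over unordered vertex pairs. It verifies the vertex count, the bound by 4 n_c N for N > 18, and that a crude upper bound on W(T) stays below the weight (4 n_c N)^3 of an irrelevant edge.

```python
def vertex_count(n_c, N):
    """Vertices in the construction: 12 per clause plus N + 2 per bridge."""
    clause_vertices = 12 * n_c
    variable_vertices = 3 * n_c * (N + 2)
    return clause_vertices + variable_vertices

def irrelevant_weight(n_c, N):
    """Weight given to irrelevant edges in the text: (4 n_c N)^3."""
    return (4 * n_c * N) ** 3

def crude_cost_bound(n_c, N):
    """Crude upper bound on W(T): (number of pairs) * (longest possible distance).

    With edge weights at most 1, a shortest path on V vertices uses at most
    V - 1 edges, so every distance is at most V - 1.
    """
    V = vertex_count(n_c, N)
    pairs = V * (V - 1) // 2
    return pairs * (V - 1)

def bounds_hold(n_c, N):
    """Check the three claims of the paragraph for one choice of n_c and N > 18."""
    V = vertex_count(n_c, N)
    return (
        V == 18 * n_c + 3 * n_c * N
        and V < 4 * n_c * N
        and crude_cost_bound(n_c, N) < irrelevant_weight(n_c, N)
    )
```

At N = 18 the vertex count equals 4 n_c N exactly, which is why the text asks for the strict inequality N > 18.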



Proof. (Lemma [5]) 

Suppose a variable is in impure state. Start with a dashed edge and follow 
the loop in clockwise order. At some point we switch to dotted edges. Where 
this happens, we have a local structure that we call bubble. There can be two 
different types of bubble, depending on whether the transition occurs at a bridge 
or somewhere else in the wire-piece. As we continue on the loop, we have to switch 
back to dashed edges somewhere. Where this happens, locally we have a hole. 
Figure 10 shows these local features. It is easy to see that if a variable gadget 
is impure, it has to have at least one hole and at least one bubble. What we 
show is that we can remove a hole and a bubble while decreasing the cost of 
the triangulation. When all holes and bubbles are removed, the triangulation is 
pure. 

We consider a sequence consisting of a hole, a pure piece of wire and a bubble 
(of either type). The pure piece of wire might have several attached bridges 
along the way. We flip each flippable edge of the pure wire-piece (including any 
bridges along the way), thereby absorbing the hole and the bubble at the ends 
(Fig. 10(d)). Now we look at the change in cost due to this operation. We count 




Fig. 10: (a) Hole. (b) Bubble. (c) Bubble at bridge. (d) Removing a bubble and 
a hole. 



first the distances that might have become larger, then we count the distances 
which provably became smaller. 

Take any two vertices and the shortest path between them. If a pure wire-piece 
contained in this path is flipped, the distance between the two endpoints 
can increase by at most 2 (we can simulate the old path with the new path and 
two extra edges of length 1 at both ends). If a bridge along the path is flipped 
(or partially deleted, if a bridge-bubble was removed), this can also increase any 
distance by at most 2 (we can simulate the old bridge using the new bridge and 
two extra edges at the bridge endpoints of length 1 each). If a bridge is flipped, 
this can force a change within the clause to which it is connected. In Fig. 3(c) we 
can verify that in any triangulation of a clause the distance between two vertices 
is less than 5. Therefore, a single clause can increase a path that somehow 
intersects it by less than 5 during this transformation. The total number of 
clauses is $n_c$ and the number of bridges is $3 n_c$, therefore the penalty on any distance 
between two points is at most $2 + 3 n_c \cdot 2 + n_c \cdot 5 = 11 n_c + 2$. The number of 
distances is less than $\binom{4 n_c N}{2}$, therefore the total penalty due to removing one 
bubble and one hole is smaller than $88 n_c^3 N^2 + 16 n_c^2 N^2 < 100 n_c^3 N^2$ (assuming 
$n_c > 1$). 
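The arithmetic behind the penalty bound can be replayed with exact integers. This is a small sketch with hypothetical function names; it multiplies the per-distance penalty 11 n_c + 2 by the number of vertex pairs and compares the result with the two bounds stated above.

```python
from math import comb

def per_distance_penalty(n_c):
    """Worst-case increase of one distance: 2 + 3 n_c * 2 + n_c * 5 = 11 n_c + 2."""
    return 2 + 3 * n_c * 2 + n_c * 5

def total_penalty(n_c, N):
    """Per-distance penalty times the number of unordered pairs among 4 n_c N vertices."""
    return per_distance_penalty(n_c) * comb(4 * n_c * N, 2)

def intermediate_bound(n_c, N):
    """The bound 88 n_c^3 N^2 + 16 n_c^2 N^2 from the text."""
    return 88 * n_c**3 * N**2 + 16 * n_c**2 * N**2

def final_bound(n_c, N):
    """The bound 100 n_c^3 N^2 from the text (needs n_c > 1)."""
    return 100 * n_c**3 * N**2

def chain_holds(n_c, N):
    """Check total penalty < intermediate bound < final bound."""
    return total_penalty(n_c, N) < intermediate_bound(n_c, N) < final_bound(n_c, N)
```

For n_c = 1 the intermediate bound is 104 N^2, which exceeds 100 N^2; this is where the assumption n_c > 1 is used.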

Now we look at distances that provably decrease with the transformation. We 
can assume that between two neighboring bridges the variable gadget contains 
a single hole. If there were more, some vertices would be isolated and we would 
have to cross an irrelevant edge, incurring a cost of a. In this case, removing the 
hole would obviously decrease the cost by making the construction connected 
using only non-irrelevant edges. 

Consider now such a hole on the wire between two bridges (Fig. 11). Denote 
by $n_1$ the number of vertices between the previous bridge (in clockwise order) 
and the hole and by $n_2$ the number of vertices between the hole and the next 
bridge. Assume w.l.o.g. that $n_1 \le n_2$. Depending on the state of the bridges, 
$n_1 + n_2$ can take the values $N$, $N + 1$ or $N + 2$. It follows that $n_2 \ge \frac{N}{2}$ and 
$n_1 \le \frac{N}{2} + 1$. Among the $n_2$ vertices after the hole, consider the ^ vertices closest 
to the hole and denote this set by A. On the other side of the other bridge denote 
the vertices closest to the bridge by B (Fig. 11(a)). Assume, for now, that B 



is free of holes. We will treat the case when B has a hole later. To make things 
simpler, we choose $N$ to be divisible by 16. The following small lemma will help 
us with the computations. Its validity can be seen by a simple analysis of cases 
(see Fig. Ufa) and Fig. 10(d)). 

Lemma 10. If $M$ is the longest distance between two vertices in a hole-free 
wire-piece with $k$ vertices (possibly containing a bubble), then $\frac{k}{2} \le M \le \frac{k}{2} + 2$ 
holds. □ 




We know that all variables have at least two bridges (due to R2) and all 
bridges of a variable are connected to different clauses (due to R1). Let us look 
at the minimum distance from a vertex in A to a vertex in B while the hole is 
still there. In any direction we have to go through a wire containing at least N 
vertices, therefore the distance is at least $\frac{N}{2}$ (due to Lemma 10). When the hole 
is removed, the distance between a vertex in A and a vertex in B is at most 
M + 3. 

If there was a hole within the ^ vertices that we labeled as B, we take 
instead the ^ vertices immediately before this hole and label them as B instead 
(Fig. 11(b)). In this case we can still claim that the distances between A and 
B are at least ^ before the transformation. Now we remove both holes (and 
the two corresponding bubbles), potentially inflicting twice the penalty which 
we bounded from above by $100 n_c^3 N^2$. After removing both holes, distances 
between A and B are at most X^r + 3. We get a decrease in cost of at least ^ — 3 

f° r T ' T P arrs - 

If we enforce $N > 5 \cdot 10^5 n_c^3$, we get a net decrease in cost due to the removal 
of the hole(s). Therefore, we can transform any impure triangulation into a pure 
one of lower cost. □ 



Proof. (Lemma [7]) 

The high level idea is the following: we first count the distances that can be 
smaller in $T_{\mathrm{nonSAT}}$ than in $T_{\mathrm{SAT}}$ and we bound their contribution to $W$. Then 
we count those distances that are provably smaller in $T_{\mathrm{SAT}}$ than in $T_{\mathrm{nonSAT}}$ and 
we add up the differences. The crucial fact that makes the proof possible is the 
following: $T_{\mathrm{nonSAT}}$ has at least one clause with all three literals false. For this 
clause, crossing the gadget from one bridge to another has a cost of at least 
$1 + 2\varepsilon$. In $T_{\mathrm{SAT}}$ clause crossings have cost at most $4\varepsilon$ (Fig. 12). Using the fact 
that each clause crossing participates in $\Omega(N^2)$ distances, given bounds on $N$ 
and $\varepsilon$ we obtain the required bound on the difference between the costs. 

Figure [12] shows the optimum triangulation of the clause gadgets in each 
possible assignment, ignoring symmetric cases. We are interested in the distances 
between the endpoints of the three bridges in the clause. These are summarized 
in the bottom row of Fig. 12. The triangulations are optimal in the sense that 
no other triangulation can achieve a lower distance between any of the bridge 
endpoints. These will be the triangulations used in $T_{\mathrm{SAT}}$, but in $T_{\mathrm{nonSAT}}$ we 
will implicitly consider other triangulations of the clauses as well. Intuitively, 
it is clear that nonsatisfied clauses {F, F, F} are costlier to cross than satisfied 
ones, and indeed, this is what makes our reduction possible. Now we make this 
intuition more precise. 




Fig. 12: (top) Optimal triangulation of a clause. (bottom) Cost of crossing a clause. 



Every distance between two vertices is of one of the following types: (i) be- 
tween vertices of the same clause, (ii) between vertices of the same variable, 
(iii) between a vertex from a clause and a vertex from a variable, (iv) between 
vertices of different clauses, and (v) between vertices of different variables. 

We go through the five types of distances and denote their contribution to 
the cost by $W_1, \dots, W_5$. In each of the five cases we want to compare the cost 
of $T_{\mathrm{SAT}}$ with the cost of $T_{\mathrm{nonSAT}}$. 

(i) Within a single clause, even in the most unfavorable triangulation, the 
distance between any two vertices is less than 5, therefore $W_1 < n_c \binom{12}{2} \cdot 5 = 330 n_c$. 



(ii) Variable gadgets in the two different states are isomorphic, therefore 

$W_2(T_{\mathrm{SAT}}) = W_2(T_{\mathrm{nonSAT}})$. 

(iii) There are fewer than $(4 n_c N)(12 n_c)$ such distances. If we look at one 
shortest path as we move from a satisfying to a non-satisfying assignment, the path 
length can decrease by at most 1 at both endpoints. Variable-crossings and 
bridge-crossings along the way maintain their length and clause-crossings can 
decrease by at most $2\varepsilon$ each (compare crossing costs of clauses in Fig. 12). 
Therefore $W_3(T_{\mathrm{SAT}}) - W_3(T_{\mathrm{nonSAT}}) < (2 + 6 n_c \varepsilon)(4 n_c N)(12 n_c)$, which, 
assuming $\varepsilon < \frac{1}{6 n_c}$, is less than $144 n_c^2 N$. 

(iv) By a similar argument, $W_4(T_{\mathrm{SAT}}) - W_4(T_{\mathrm{nonSAT}}) 
< (2 + 6 n_c \varepsilon)(12 n_c)(12 n_c) < 432 n_c^2$, assuming $\varepsilon < \frac{1}{6 n_c}$. 

(v) This part is the crucial one, since it contributes the highest order term 
in $N$ to the cost. Our goal is to show that $W_5(T_{\mathrm{nonSAT}}) - W_5(T_{\mathrm{SAT}}) = \Omega(N^2)$, 
outweighing the other four differences, which are all $O(N)$. 

Let us look at the distance between two vertices, $p_x$ and $p_y$, from different 
variable gadgets (Fig. [13]). Let $d(p_x, p_y)$ be their distance in $T_{\mathrm{SAT}}$ and $d'(p_x, p_y)$ 
their distance in $T_{\mathrm{nonSAT}}$. 




Fig. 13: Shortest path between p x and p y . Variables appear as circles, clauses as tri- 
angles. 

We denote variable gadgets as $V_1, \dots, V_{n_v}$ and we write the cost due to 
distances between vertices from different variables as: 

$W_5(T_{\mathrm{SAT}}) = \sum_{1 \le x < y \le n_v} \sum_{p_x \in V_x} \sum_{p_y \in V_y} d(p_x, p_y).$ 

In Fig. 13 we see that every vertex has a natural neighbor, the vertex to which 
it is connected by a thick solid edge. We denote the neighbors of $p_x$ and $p_y$ as 
$\bar{p}_x$ and $\bar{p}_y$, respectively. We define the distance between two pairs of neighboring 
points as follows: 

$d([p_x \bar{p}_x], [p_y \bar{p}_y]) = d(p_x, p_y) + d(\bar{p}_x, p_y) + d(p_x, \bar{p}_y) + d(\bar{p}_x, \bar{p}_y).$ 

One vertex out of every pair of neighbors is a leaf vertex, in the sense that a 
path from that vertex to any other vertex goes through its neighbor. In Fig. 14 



Fig. 14: Vertices in a bridge-to-bridge portion of a variable gadget. 



left, $\bar{p}_x$ and $\bar{p}_0$ are leaf vertices, but when the variable is flipped into the other 
pure state (Fig. 14 right), the situation reverses and $p_x$ and $p_0$ become leaves. 
We can simplify the distance between pairs of points as follows: 

$d([p_x \bar{p}_x], [p_y \bar{p}_y]) = 4\,\phi(p_x, p_y) + 4,$ 

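where $\phi$ is the distance between the non-leaf members of both pairs, or more 
precisely: 

$\phi(p_x, p_y) = \min\bigl(d(p_x, p_y),\, d(\bar{p}_x, p_y),\, d(p_x, \bar{p}_y),\, d(\bar{p}_x, \bar{p}_y)\bigr).$ 

Now we can write the relevant part of the cost in terms of $\phi$: 

$W_5(T) = \sum_{1 \le x < y \le n_v} \sum_{p_x \in V_x} \sum_{p_y \in V_y} \bigl(\phi(p_x, p_y) + 1\bigr).$ 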
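The identity that the four distances of a group sum to four times the non-leaf distance plus 4 can be illustrated on a toy graph. The sketch below is an assumption-laden illustration, not the construction itself: the vertex names, the small path between the two pairs, and the value of `EPS` are all hypothetical. Each pair consists of a leaf attached by a unit-weight edge to a non-leaf vertex, as in the text.

```python
import heapq
from fractions import Fraction

def shortest_distances(adj, source):
    """Dijkstra's algorithm on an undirected weighted graph {u: [(v, w), ...]}."""
    dist = {source: Fraction(0)}
    heap = [(Fraction(0), source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def build_graph(edges):
    """Build an adjacency list from (u, v, weight) triples."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    return adj

EPS = Fraction(1, 100)  # hypothetical small edge weight on the connecting path
EDGES = [
    ("x_leaf", "x_core", 1),  # unit edge joining the two neighbors on the x side
    ("x_core", "m1", 1),
    ("m1", "m2", EPS),
    ("m2", "y_core", 1),
    ("y_core", "y_leaf", 1),  # unit edge joining the two neighbors on the y side
]
ADJ = build_graph(EDGES)
DIST = {u: shortest_distances(ADJ, u) for u in ADJ}

# The four distances of one group, their minimum (phi) and their sum.
four = [DIST[a][b] for a in ("x_leaf", "x_core") for b in ("y_leaf", "y_core")]
phi = min(four)
group_distance = sum(four)
```

Here the four distances are phi, phi + 1 (twice) and phi + 2, so averaging over the group gives phi + 1 per distance, which is the summand in the expression for the cost.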

Let $\phi(p_x, p_y)$ denote the distance defined above in $T_{\mathrm{SAT}}$ and $\phi'(p_x, p_y)$ the 
corresponding distance in $T_{\mathrm{nonSAT}}$. We want to bound $\phi - \phi'$ from above. Let us 
decompose $\phi(p_x, p_y)$ into components. Remember that $\phi$ is the distance between 
two non-leaf points, i.e., the length of the shortest path between them. Such a 
path goes from $p_x$ to a bridge, then crosses a number of bridges, clauses and 
variables, arrives at the target variable, and goes from the bridge to $p_y$. Observe 
that the first and last components (endpoint to bridge) do not change with the 
flipping of a variable. This can be seen in Fig. 14: the distance $d(p_x, p_0)$ on the 
left and the distance $d(\bar{p}_x, \bar{p}_0)$ on the right are equal. Variable-crossing costs do 
not change either: a variable bridge-to-bridge portion always has distance $\frac{N}{2} + 1$. 
Neither do bridge-crossings, which always cost $\varepsilon$. 

The only difference in cost between $\phi$ and $\phi'$ is due to clause crossings. 
Whereas $T_{\mathrm{SAT}}$ contains only clauses of the type {T,T,T}, {T,T,F}, {T,F,F}, 
in $T_{\mathrm{nonSAT}}$ we have at least one {F,F,F} clause. Thus, according to Fig. 12, the 
maximum cost of a crossing in $T_{\mathrm{SAT}}$ is $4\varepsilon$ and the minimum cost of a crossing in 
$T_{\mathrm{nonSAT}}$ is $2\varepsilon$. A shortest path can cross each clause only once, otherwise there 
would exist a shortcut. Since there are $n_c$ clauses in total, we obtain the bound: 

$\phi(p_x, p_y) \le \phi'(p_x, p_y) + 2 n_c \varepsilon.$ 

In W5 we have at most ( 3 2 C ) (N + 2) 2 distances, each of which can be shorter 
by at most $2 n_c \varepsilon$ in $T_{\mathrm{nonSAT}}$ than in $T_{\mathrm{SAT}}$ (provided that we group distances 



four-by-four as explained above and we average over the groups). The number 
of distances (assuming $N > 12 n_c^2$) is less than $4 n_c^2 N^2$. 

Now let us look at distances that are provably larger in $T_{\mathrm{nonSAT}}$ than in $T_{\mathrm{SAT}}$. 
We know that there is at least one clause crossing that has cost $1 + 2\varepsilon$ in $T_{\mathrm{nonSAT}}$ 
and cost at most $4\varepsilon$ in $T_{\mathrm{SAT}}$. This clause crossing is thus at least $1 - 2\varepsilon$ costlier in 
$T_{\mathrm{nonSAT}}$ than in $T_{\mathrm{SAT}}$. For every clause crossing there are at least $\frac{N}{4} \cdot \frac{N}{4}$ shortest 
paths going through that crossing, regardless of the states of the variables, i.e., 
both in $T_{\mathrm{nonSAT}}$ and in $T_{\mathrm{SAT}}$. This fact is illustrated in Fig. [15] (sets A and B). 
In this way we get that there are at least $\left(\frac{N}{4}\right)^2$ distances that contribute at least 
$1 - 2\varepsilon$ more to $W_5(T_{\mathrm{nonSAT}})$ than to $W_5(T_{\mathrm{SAT}})$. 



distance > N/2 




Fig. 15: Shortest paths that have to go across a given clause crossing. 

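Now we have all the ingredients to compare $W(T_{\mathrm{nonSAT}})$ and $W(T_{\mathrm{SAT}})$: 

$$
\begin{aligned}
W(T_{\mathrm{nonSAT}}) - W(T_{\mathrm{SAT}}) ={}& W_1(T_{\mathrm{nonSAT}}) - W_1(T_{\mathrm{SAT}}) + W_2(T_{\mathrm{nonSAT}}) - W_2(T_{\mathrm{SAT}}) \\
&+ W_3(T_{\mathrm{nonSAT}}) - W_3(T_{\mathrm{SAT}}) + W_4(T_{\mathrm{nonSAT}}) - W_4(T_{\mathrm{SAT}}) \\
&+ W_5(T_{\mathrm{nonSAT}}) - W_5(T_{\mathrm{SAT}}) \\
>{}& -330 n_c - 144 n_c^2 N - 432 n_c^2 - 8 n_c^3 N^2 \varepsilon + \left(\tfrac{N}{4}\right)^2 (1 - 2\varepsilon) \\
>{}& \frac{N^2}{32}
\end{aligned}
$$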

(assuming N > 5 • 10 5 n c 3 and e < ^) 



□ 



Proof. (Lemma H]) 

We go through the same computations as for Lemma [7] and use the fact that 
between two satisfying assignments a clause crossing can change cost by at most 
$2\varepsilon$. □ 



Baseline Triangulation. Our goal is to generate $W^*$ somewhere in the gap 
between the costs of satisfying and non-satisfying triangulations. It would be 
sufficient to generate a triangulation corresponding to a satisfying assignment and 
add $150 n_c^2 N$ to its cost. We do not even know, however, whether a satisfying 
assignment exists. 

Instead, we construct a simpler triangulation that we call baseline: we assign 
to each variable an arbitrary truth value and triangulate the variable gadgets 
and attached bridges accordingly. Then we replace each clause with the baseline 
gadget of Fig. 16, connecting the three vertices of the triangle to the bridge 
endpoints that were supposed to connect to that clause. We note that many 
other configurations would work similarly well as a baseline gadget. For the 
described construction we can compute $W(T_{\mathrm{baseline}})$ using an all-pairs shortest 
path algorithm. 
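As an illustration of this last step, the following sketch computes the total pairwise distance of a small weighted graph with the Floyd-Warshall algorithm. It is a generic example rather than the method of the paper: the function name and the edge-list input format are assumptions, the text does not say which all-pairs algorithm is intended, and W is taken here to be the sum of shortest-path distances over unordered vertex pairs.

```python
from itertools import combinations

def total_pairwise_distance(n, edges):
    """Sum of shortest-path distances over all unordered pairs of vertices.

    `n` is the number of vertices (labelled 0..n-1) and `edges` is a list of
    undirected (u, v, weight) triples with positive weights.
    """
    inf = float("inf")
    dist = [[0 if i == j else inf for j in range(n)] for i in range(n)]
    for u, v, w in edges:
        if w < dist[u][v]:  # keep the lighter edge if duplicates occur
            dist[u][v] = dist[v][u] = w
    # Floyd-Warshall: allow intermediate vertices 0..k one at a time.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                through_k = dist[i][k] + dist[k][j]
                if through_k < dist[i][j]:
                    dist[i][j] = through_k
    return sum(dist[i][j] for i, j in combinations(range(n), 2))

# A unit-weight triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
example_cost = total_pairwise_distance(4, [(0, 1, 1), (1, 2, 1), (0, 2, 1), (2, 3, 1)])
```

The cubic running time is polynomial in the number of vertices, which is all the reduction needs; for an instance with fewer than 4 n_c N vertices this is O((n_c N)^3) arithmetic operations.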



Proof. (Lemma ^ 

The computations are similar to those in the previous proofs (we look at the 
five different types of distances): 

(i) Within a single clause of $T_{\mathrm{baseline}}$ the distance between any two vertices is 
less than 2, therefore $W_1(T_{\mathrm{baseline}}) < n_c \binom{12}{2} \cdot 2 = 132 n_c$. 

(ii) Here also $W_2(T_{\mathrm{baseline}}) = W_2(T_{\mathrm{SAT}})$. 

(iii) Here also $|W_3(T_{\mathrm{baseline}}) - W_3(T_{\mathrm{SAT}})| < (2 + 6 n_c \varepsilon)(4 n_c N)(12 n_c) < 144 n_c^2 N$. 

(iv) Here also $|W_4(T_{\mathrm{baseline}}) - W_4(T_{\mathrm{SAT}})| < (12 n_c)(12 n_c)(2 + 6 n_c \varepsilon) < 432 n_c^2$. 

(v) Here also, clause crossings can change by at most $2\varepsilon$, therefore 
$|W_5(T_{\mathrm{baseline}}) - W_5(T_{\mathrm{SAT}})| < 8 n_c^3 N^2 \varepsilon$. 

Overall we find that $|W(T_{\mathrm{baseline}}) - W(T_{\mathrm{SAT}})| < 150 n_c^2 N$. □ 




Fig. 16: Clause gadget in baseline triangulation. 



